Distortion discriminant analysis for audio fingerprinting

Authors

  • Christopher J. C. Burges
  • John C. Platt
  • Soumya Jana
Abstract

Mapping audio data to feature vectors for classification, retrieval, or identification tasks presents four principal challenges. The dimensionality of the input must be significantly reduced; the resulting features must be robust to likely distortions of the input; the features must be informative for the task at hand; and the feature extraction operation must be computationally efficient. In this paper, we propose Distortion Discriminant Analysis (DDA), which fulfills all four of these requirements. DDA constructs a linear, convolutional neural network out of layers, each of which performs an oriented PCA dimensional reduction. We demonstrate the effectiveness of DDA on two audio fingerprinting tasks: searching for 500 audio clips in 36 hours of audio test data; and playing over 10 days of audio against a database with approximately 240,000 fingerprints. We show that the system is robust to kinds of noise that are not present in the training procedure. In the large test, the system gives a false positive rate of 1.5 × 10⁻⁸ per audio clip, per fingerprint, at a false negative rate of 0.2% per clip.
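The abstract names oriented PCA as the building block of each DDA layer but does not spell it out. As a rough illustration only, oriented PCA is commonly formulated as finding directions n that maximize the ratio of signal variance to distortion energy, which leads to the generalized eigenproblem C n = q R n, where C is the covariance of the clean data and R is the correlation matrix of the distortion vectors. The sketch below shows a single such layer under that assumption; the function name `opca_projections`, the `reg` regularizer, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def opca_projections(clean, distorted, n_components, reg=1e-6):
    """Sketch of one oriented PCA layer (illustrative, not the paper's code).

    clean:     array (n_samples, dim) of undistorted feature vectors
    distorted: array (n_samples, dim), distorted versions of the same rows
    Returns (directions, ratios): directions has shape (dim, n_components).
    """
    # Signal covariance C of the clean data.
    centered = clean - clean.mean(axis=0)
    C = centered.T @ centered / len(clean)

    # Correlation matrix R of the distortion vectors (distorted - clean).
    noise = distorted - clean
    R = noise.T @ noise / len(noise)
    R += reg * np.eye(R.shape[0])  # assumed regularizer, keeps R positive definite

    # Solve C n = q R n by whitening with the Cholesky factor R = L L^T,
    # which turns it into the ordinary symmetric problem (L^-1 C L^-T) v = q v.
    L = np.linalg.cholesky(R)
    M = np.linalg.solve(L, C)            # L^-1 C
    A = np.linalg.solve(L, M.T).T        # L^-1 C L^-T
    A = (A + A.T) / 2                    # remove round-off asymmetry
    q, V = np.linalg.eigh(A)             # eigenvalues in ascending order

    # Keep the directions with the largest signal-to-distortion ratio.
    order = np.argsort(q)[::-1][:n_components]
    directions = np.linalg.solve(L.T, V[:, order])  # n = L^-T v
    return directions, q[order]

# Toy data (assumed): axis 0 carries strong signal and little distortion,
# axis 1 carries strong signal but heavy distortion, axis 2 is weak signal.
rng = np.random.default_rng(0)
clean = rng.normal(size=(2000, 3)) * np.array([3.0, 3.0, 0.5])
distorted = clean + rng.normal(size=(2000, 3)) * np.array([0.1, 2.0, 0.1])

directions, ratios = opca_projections(clean, distorted, n_components=2)
features = distorted @ directions  # reduced, distortion-robust features
```

In this toy setup ordinary PCA would treat axes 0 and 1 as equally important, whereas the oriented variant ranks axis 1 last because most of its variance is matched by distortion. A multi-layer system in the spirit of the abstract would apply such a projection to short windows and feed the concatenated outputs to the next layer; that wiring is not shown here.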


Similar resources

Locally most-powerful detector for secret key estimation in spread spectrum image steganography

Kernel Fisher discriminant for steganalysis of JPEG hiding methods; On estimation of secret message length in LSB steganography in spatial domain; Steganalysis using color wavelet statistics and one-class support vector machines; Steganalysis using modified pixel comparison and complexity measure; Performance evaluation of blind steganalysis classifiers; Searching for t...


Recognition of Activities of Daily Living Based on Environmental Analyses Using Audio Fingerprinting Techniques: A Systematic Review

An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of ...


Fast Hamming Space Search for Audio Fingerprinting Systems

In music information retrieval, a huge search space has to be explored because a query audio clip can start at any position of any music in the database, and also a query is often corrupted by significant noise and distortion. Audio fingerprints have recently attracted much attention in music information retrieval, for they provide a compact representation of the perceptually relevant parts of ...


Discriminative analysis of distortion sequences in speech recognition

In a traditional speech recognition system, the distance score between a test token and a reference pattern is obtained by simply averaging the distortion sequence resulting from matching the two patterns through a dynamic programming procedure. The final decision is made by choosing the one with the minimal average distance score. If we view the distortion sequence as a form of observed feat...


A Framework for Robust Audio Fingerprinting

We present a framework for audio fingerprinting, rather general in its essence, but especially tuned for being used in the context of broadcast monitoring. We efficiently implemented a robust fingerprinting algorithm and a suitable retrieval method. Ample sections are devoted to strategies for improving both the reliability and the speed of the overall system. The outcomes of plentiful experime...



Journal:
  • IEEE Trans. Speech and Audio Processing

Volume 11

Publication date: 2003